[SPARK-7411] [SQL] Support SerDe for HiveQl in CTAS #5963

Closed

Conversation

chenghao-intel
Contributor

This is a follow-up to #5876 and should be merged after #5876.

Let's wait for the unit test results from Jenkins.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 7, 2015

Test build #32072 has started for PR 5963 at commit a8575a0.

@SparkQA

SparkQA commented May 7, 2015

Test build #32072 has finished for PR 5963 at commit a8575a0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CreateTableAsSelect(

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32072/
Test FAILed.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 7, 2015

Test build #32077 has started for PR 5963 at commit 889d822.

@SparkQA

SparkQA commented May 7, 2015

Test build #32077 has finished for PR 5963 at commit 889d822.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class CreateTableAsSelect(

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32077/
Test PASSed.

@yhuai
Contributor

yhuai commented May 8, 2015

@chenghao-intel I have merged #5876.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 9, 2015

Test build #32302 has started for PR 5963 at commit f4e243f.

@SparkQA

SparkQA commented May 9, 2015

Test build #32302 has finished for PR 5963 at commit f4e243f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32302/
Test PASSed.

// parquet.hive.DeprecatedParquetInputFormat => Parquet
// TODO configurable?
format.contains("Orc") || format.contains("Parquet") || format.contains("RCFile")
}).getOrElse(false))
Contributor

I'm not sure this is really the logic we want. The goal here is that, by default (i.e., if the user does not specify anything about storage) and when convertCTAS is turned on, we use the data sources API. Would it be possible to have the parser fill in the storage options only when the user specifies them, and defer filling in default values until we are in the analyzer? That way we can distinguish "no storage options specified" from "default storage options chosen".
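
A minimal sketch of that split, to make the suggestion concrete. Everything here is hypothetical and simplified: `StorageSpec`, `CtasPlan`, `CtasAnalyzer`, and the `"parquet"` default provider are illustrative names, not the classes or settings used in this patch, and the Hive class names appear only as plain strings standing in for assumed text-format defaults.

```scala
// Hypothetical model: the parser emits only what the user wrote.
final case class StorageSpec(
    inputFormat: Option[String] = None,
    outputFormat: Option[String] = None,
    serde: Option[String] = None) {
  // True only when the statement carried no storage clause at all.
  def isUnspecified: Boolean =
    inputFormat.isEmpty && outputFormat.isEmpty && serde.isEmpty
}

sealed trait CtasPlan
final case class DataSourceCtas(table: String, provider: String) extends CtasPlan
final case class HiveCtas(table: String, storage: StorageSpec) extends CtasPlan

object CtasAnalyzer {
  // Assumed defaults, applied during analysis rather than in the parser.
  val hiveDefaults: StorageSpec = StorageSpec(
    inputFormat = Some("org.apache.hadoop.mapred.TextInputFormat"),
    outputFormat = Some("org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"),
    serde = Some("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"))

  // Assumed default data source provider; illustrative only.
  val defaultProvider: String = "parquet"

  def resolve(table: String, parsed: StorageSpec, convertCTAS: Boolean): CtasPlan =
    if (convertCTAS && parsed.isUnspecified) {
      // Nothing was specified, so the conversion is safe to apply.
      DataSourceCtas(table, defaultProvider)
    } else {
      // Keep the user's choices and fill only the missing pieces.
      HiveCtas(table, StorageSpec(
        inputFormat = parsed.inputFormat.orElse(hiveDefaults.inputFormat),
        outputFormat = parsed.outputFormat.orElse(hiveDefaults.outputFormat),
        serde = parsed.serde.orElse(hiveDefaults.serde)))
    }
}
```

Because the parser leaves every field as `None` unless the user wrote it, the analyzer can tell an empty storage clause apart from one that happens to match the defaults, which a string check on the format name cannot do.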

Contributor Author

OK, I see. I will move the default SerDe handling from HiveQl to the Analyzer.

@marmbrus
Contributor

marmbrus commented May 9, 2015

This is looking pretty good. Thanks for taking the time to flesh this part out.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 10, 2015

Test build #32343 has started for PR 5963 at commit a8260e8.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 10, 2015

Test build #32344 has started for PR 5963 at commit f87ace6.

chenghao-intel changed the title from "[SPARK-7411] [SQL] [WIP]Support SerDe for HiveQl in CTAS" to "[SPARK-7411] [SQL] Support SerDe for HiveQl in CTAS" on May 10, 2015

@SparkQA

SparkQA commented May 10, 2015

Test build #32343 has finished for PR 5963 at commit a8260e8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32343/
Test PASSed.

@SparkQA

SparkQA commented May 10, 2015

Test build #32344 timed out for PR 5963 at commit f87ace6 after a configured wait of 150m.

@AmplabJenkins

Merged build finished. Test FAILed.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32344/
Test FAILed.

@chenghao-intel
Contributor Author

retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@SparkQA

SparkQA commented May 11, 2015

Test build #32355 has started for PR 5963 at commit f87ace6.

@SparkQA

SparkQA commented May 11, 2015

Test build #32355 has finished for PR 5963 at commit f87ace6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Merged build finished. Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/32355/
Test PASSed.

@chenghao-intel
Contributor Author

@marmbrus @yhuai Any more comments?

asfgit pushed a commit that referenced this pull request May 12, 2015
This is a follow up of #5876 and should be merged after #5876.

Let's wait for unit testing result from Jenkins.

Author: Cheng Hao <[email protected]>

Closes #5963 from chenghao-intel/useIsolatedClient and squashes the following commits:

f87ace6 [Cheng Hao] remove the TODO and add `resolved condition` for HiveTable
a8260e8 [Cheng Hao] Update code as feedback
f4e243f [Cheng Hao] remove the serde setting for SequenceFile
d166afa [Cheng Hao] style issue
d25a4aa [Cheng Hao] Add SerDe support for CTAS

(cherry picked from commit e35d878)
Signed-off-by: Michael Armbrust <[email protected]>
asfgit closed this in e35d878 on May 12, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015